A Probabilistic Approach to Persian Ezafe Recognition
Authors: Habibollah Asghari, Heshaam Faili and Jalal Maleki
Abstract
In this paper, we investigate the problem of Ezafe recognition in the Persian language. Ezafe is an unstressed vowel that is usually not written, but is recognized and pronounced by human readers. The Ezafe marker can appear in noun phrases, adjective phrases, and some prepositional phrases, linking the head and its modifiers. Ezafe recognition in Persian is essentially a homograph disambiguation problem, and it is a useful task for Persian language applications such as text-to-speech (TTS). In this paper, part-of-speech tags augmented with an Ezafe marker (POSE) are used to train a probabilistic model for Ezafe recognition. The model was trained on a tagged corpus of ten million words. Three different approaches were used to build the probabilistic model: a Maximum Entropy POSE tagger, a Conditional Random Fields (CRF) POSE tagger, and a statistical machine translation approach based on a parallel corpus. Compared to previous work, the CRF POSE tagger is shown to achieve outstanding results.
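As a rough illustration of the POSE idea described in the abstract, the sketch below fuses a part-of-speech tag and an Ezafe flag into a single label, so that any sequence tagger trained on such labels predicts Ezafe as a side effect of tagging. The `+EZ` suffix, the tag names, the function names, and the transliterated example are all assumptions made for illustration; they are not taken from the paper or its corpus.

```python
from typing import List, Tuple

# Hypothetical suffix marking "this word is followed by Ezafe".
EZAFE_SUFFIX = "+EZ"


def to_pose(tagged: List[Tuple[str, str, bool]]) -> List[Tuple[str, str]]:
    """Fuse each (word, POS, has_ezafe) triple into a (word, POSE label) pair."""
    return [
        (word, pos + EZAFE_SUFFIX if has_ezafe else pos)
        for word, pos, has_ezafe in tagged
    ]


def from_pose(labels: List[str]) -> List[bool]:
    """Recover the Ezafe decision for each word from its POSE label."""
    return [label.endswith(EZAFE_SUFFIX) for label in labels]


# Transliterated toy phrase "ketab-e bozorg-e man" ("my big book"):
# the first two words carry Ezafe, the last one does not.
sample = [("ketab", "N", True), ("bozorg", "ADJ", True), ("man", "PRO", False)]
pose_pairs = to_pose(sample)
ezafe_flags = from_pose([label for _, label in pose_pairs])
```

With this encoding, Ezafe recognition reduces to ordinary sequence labelling over a larger tag set, which is what allows off-the-shelf taggers such as Maximum Entropy or CRF models to be applied.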
Similar resources
Automatic Selection of Reference Pages in Wikipedia for Improving Targeted Entities Disambiguation
A 59 A Knowledge-based Representation for Cross-Language Document Retrieval and Categorization Marc Franco-Salvador, Paolo Rosso and Roberto Navigli A 10170 A Probabilistic Approach to Persian Ezafe Recognition Habibollah Asghari, Heshaam Faili and Jalal Maleki A 10137 Acquiring a Dictionary of Emotion-Provoking Events Hoa Trong Vu, Graham Neubig, Sakriani Sakti, Tomoki Toda and Satoshi Nakamur...
A Hybrid Algorithm for Recognizing the Position of Ezafe Constructions in Persian Texts
Abstract: In the Persian language, an Ezafe construction is a linking element which joins the head of a phrase to its modifiers. The Ezafe in its simplest form is pronounced as -e, but is generally not indicated in writing. Determining the position of an Ezafe is advantageous for disambiguating the boundaries of syntactic phrases, which is a fundamental task in most natural language processi...
Persian Ezafe as a 'figure' Marker: a Unified Analysis
This article is a conceptual exploration of Ezafe in Modern Persian. I will consider cases where Ezafe seems to be conceptually non-neutral. In certain cases of the ‘X-e Y’ construction, X and Y can change their places with a shift in meaning while they are apparently frozen in their positions in other cases of Ezafe construction. The question to address here is if the Ezafe element -e marks an...
On the Importance of Ezafe Construction in Persian Parsing
Ezafe construction is an idiosyncratic phenomenon in the Persian language. It is a good indicator for phrase boundaries and dependency relations but mostly does not appear in the text. In this paper, we show that adding information about Ezafe construction can give 4.6% relative improvement in dependency parsing and 9% relative improvement in shallow parsing. For evaluation purposes, Ezafe tags...
Persian Handwritten Digit Recognition Using Particle Swarm Probabilistic Neural Network
Handwritten digit recognition can be categorized as a classification problem. The Probabilistic Neural Network (PNN) is one of the most effective and useful classifiers, and it is based on Bayes' rule. In this paper, in order to recognize Persian (Farsi) handwritten digits, a combination of an intelligent clustering method and PNN has been utilized. Hoda database, which includes 80000 P...